Generic minimizing behavior in semi-algebraic optimization


D. Drusvyatskiy   A.D. Ioffe   A.S. Lewis


Abstract 

We present a theorem of Sard type for semi-algebraic set-valued mappings whose 
graphs have dimension no larger than that of their range space: the inverse of such a 
mapping admits a single-valued analytic localization around any pair in the graph, 
for a generic value parameter. This simple result yields a transparent and unified 
treatment of generic properties of semi-algebraic optimization problems: “typical” 
semi-algebraic problems have finitely many critical points, around each of which they 
admit a unique “active manifold” (analogue of an active set in nonlinear optimization);
moreover, such critical points satisfy strict complementarity and second-order
sufficient conditions for optimality are indeed necessary. 


1 Introduction 

Many problems of contemporary interest can broadly be phrased as an inverse problem:
given a vector $y$ in $\mathbf{R}^m$, find a point $x$ satisfying the inclusion

$$y \in F(x),$$

where $F\colon \mathbf{R}^n \rightrightarrows \mathbf{R}^m$ is some set-valued mapping (a mapping taking elements of $\mathbf{R}^n$
to subsets of $\mathbf{R}^m$) arising from the problem at hand. In other words, we would like
to find a point $x$ such that the pair $(x, y)$ lies in the graph

$$\operatorname{gph} F := \{(x,y) : y \in F(x)\}.$$

Stability analysis of such problems then revolves around understanding sensitivity of
the solution set $F^{-1}(y)$ near $x$ to small perturbations in $y$. An extremely desirable
property is for $F$ to be strongly regular [40, Section 3G] at a pair $(x,y)$ in $\operatorname{gph} F$,
meaning that the graph of the inverse $F^{-1}$ coincides locally around $(y, x)$ with the
graph of a single-valued Lipschitz continuous function $g\colon \mathbf{R}^m \to \mathbf{R}^n$. Naturally
then, vectors $y$ for which there exists a solution $x \in F^{-1}(y)$ so that $F$ is not strongly
regular at $(x, y)$ are called weak critical values of $F$. We begin this work by asking
the following question of Sard type:

Which mappings $F\colon \mathbf{R}^n \rightrightarrows \mathbf{R}^m$ have “almost no” weak critical values?

A little thought shows an immediate obstruction: the size of the graph of $F$. Clearly
if $\operatorname{gph} F \subset \mathbf{R}^n \times \mathbf{R}^m$ has dimension (in some appropriate sense) larger than $m$, then
no such result is possible. Hence, at the very least, we should insist that $\operatorname{gph} F$ is
in some sense small in the ambient space $\mathbf{R}^n \times \mathbf{R}^m$.

Luckily, set-valued mappings having small graphs are common in optimization
and variational analysis literature. Monotone operators make up a fundamental
example: a mapping $F\colon \mathbf{R}^n \rightrightarrows \mathbf{R}^n$ is monotone if the inequality $\langle x_1 - x_2, y_1 - y_2\rangle \ge 0$
holds whenever the pairs $(x_i, y_i)$ lie in $\operatorname{gph} F$. Minty [34] famously showed that the
graph of a maximal monotone mapping on $\mathbf{R}^n$ is Lipschitz homeomorphic to $\mathbf{R}^n$, and
hence monotone graphs can be considered small for our purposes. This property,
for example, is fundamentally used in [37, 38]. The most important example of
monotone mappings in optimization is the subdifferential $\partial f$ of a convex function
$f$. More generally, we may consider set-valued mappings arising from variational
inequalities:

$$x \mapsto g(x) + N_Q(x),$$

where $g$ is locally Lipschitz continuous and $N_Q$ is the normal cone to a closed
convex subset $Q$ of $\mathbf{R}^n$. Such mappings appear naturally in perturbation theory
for variational inequalities; see [10]. One can easily check that the graph of this
mapping is locally Lipschitz homeomorphic to $\operatorname{gph} N_Q$, and is therefore small in our
understanding. In particular, we may look at conic optimization problems of the
form

$$\min_x\ \{f(x) : G(x) \in K\},$$

for a smooth function $f\colon \mathbf{R}^n \to \mathbf{R}$, a smooth mapping $G\colon \mathbf{R}^n \to \mathbf{R}^m$, and a closed
convex cone $K$ in $\mathbf{R}^m$. Standard first order optimality conditions (under an appropriate
qualification condition) amount to the variational inequality


$$\begin{bmatrix} 0 \\ 0 \end{bmatrix} \in \begin{bmatrix} \nabla f(x) + \nabla G(x)^*\lambda \\ -G(x) \end{bmatrix} + N_{\{0\}^n \times K^*}(x, \lambda),$$

where $K^*$ is the dual cone of $K$ and the vector $\lambda$ serves as a generalized Lagrange
multiplier; see [40] for a discussion. Consequently the set-valued mapping on the
right-hand side again has a small graph.

In summary, set-valued mappings with small graphs appear often, and naturally
so, in optimization problems. Somewhat surprisingly, assuming that the graph is
small is by itself not enough to guarantee that strong regularity is typical, which is the
conclusion that we seek. For instance, there exists a $C^1$-smooth convex function
$g\colon \mathbf{R} \to \mathbf{R}$ so that every number on the real line is a weakly critical value of the
subdifferential $\partial g$. Such a function is easy to construct. Indeed, let $f\colon \mathbf{R} \to \mathbf{R}$
be a surjective, continuous, and strictly increasing function whose derivative is zero
almost everywhere (such a function $f$ is described in [46] for example). Observe that
$f$ is nowhere locally Lipschitz continuous, since otherwise the fundamental theorem
of calculus would imply that $f$ is constant on some interval, a contradiction.
On the other hand $f$ is the derivative of the function $h(t) := \int_0^t f(r)\,dr$. The Fenchel
conjugate $h^*\colon \mathbf{R} \to \mathbf{R}$ is then exactly the function $g$ that we seek. This example
is interesting in light of Mignot’s theorem m Theorem 9.65], which guarantees
that at almost every subgradient, the inverse of the convex subdifferential must be
single-valued and differentiable, though as we see, not necessarily locally Lipschitz
continuous.

In light of this example, we see that even monotone variational inequalities 
can generically fail to be strongly regular. Incidentally, this explains the absence 
of Sard’s theorem from all standard texts on variational inequalities (e.g. [19U20L 
m), thereby deviating from classical mathematical analysis literature where implicit 
function theorems go hand in hand with Sard’s theorem. 

Motivated by optimization problems typically arising in practice, we consider
semi-algebraic set-valued mappings: those whose graphs can be written as a finite
union of sets each defined by finitely many polynomial inequalities. See for example
[27] on the role of such mappings in nonsmooth optimization. In Theorem 3.7 we
observe that any semi-algebraic mapping $F\colon \mathbf{R}^n \rightrightarrows \mathbf{R}^m$, whose graph has dimension
no larger than $m$, has almost no weak critical values (in the sense of Lebesgue
measure). Thus in the semi-algebraic setting, the size of the graph is the only
obstruction to the Sard-type theorem that we seek.

Despite its simplicity, both in the statement and the proof, Theorem 3.7 leads
to a transparent and unified treatment of generic properties of semi-algebraic
optimization problems, covering in particular polynomial optimization problems,
semidefinite programming, and copositive optimization, all topics of contemporary
interest. To illustrate, consider the family of optimization problems

$$\min_x\ f(x) + h(G(x) + y) - v^T x,$$

where $f$ and $h$ are semi-algebraic functions on $\mathbf{R}^n$ and $\mathbf{R}^m$, respectively, and
$G\colon \mathbf{R}^n \to \mathbf{R}^m$ is a $C^2$-smooth semi-algebraic mapping. Here the vectors $v$, $y$ serve as
perturbation parameters. First order optimality conditions (under an appropriate
qualification condition) then take the form of a generalized equation

$$\begin{bmatrix} v \\ y \end{bmatrix} \in \begin{bmatrix} \nabla G(x)^*\lambda \\ -G(x) \end{bmatrix} + \big(\partial f \times (\partial h)^{-1}\big)(x, \lambda),$$


where the subdifferentials $\partial f$ and $\partial h$ are meant in the limiting sense; see e.g. m-
Observe that the perturbation parameters $(v, y)$ appear in the range of the set-valued
mapping on the right-hand side. This set-valued mapping, in turn, has a small
graph. Indeed, the graphs of the subdifferential mappings $\partial f$ and $\partial h$ always have
dimension exactly $n$ and $m$, respectively m Theorem 3.7] (even locally around each
of their points mi Theorem 3.8], [12, Theorem 5.13]); monotonicity or convexity are
irrelevant here. Thus the semi-algebraic Sard’s theorem applies. In turn, appealing
to some standard semi-algebraic techniques, we immediately conclude: for almost
all parameters $(v,y) \in \mathbf{R}^n \times \mathbf{R}^m$, the problem admits finitely many composite
critical points with each one satisfying a strict complementarity condition, a basic
qualification condition (generalizing that of Mangasarian-Fromovitz) holds, both $f$
and $h$ admit unique active manifolds in the sense of mm, and positivity of a
second-derivative (of parabolic type) is both necessary and sufficient for second-order
growth.

This development nicely unifies and complements a number of earlier results,
such as the papers [ 431 US ] on generic optimality conditions in nonlinear programming,
the study of the complementarity problem [42], generic strict complementarity
and nondegeneracy in semi-definite programming mm, as well as the general study
of strict complementarity in convex optimization [13, 35]. In contrast, many of our
arguments are entirely independent of the representation of the semi-algebraic
optimization problem at hand. It is worth noting that convexity (and even Clarke
regularity) is of no consequence for us. In particular, our results generalize and
drastically simplify the main results of [3], where convexity of the semi-algebraic
optimization problem plays a key role. Though we state our results for semi-algebraic
problems, they all generalize to the “tame” setting; see m for the definitions. Key
elements of the development we present here were first reported in m- In particular
Theorem 7.3 in that work sketches the proof of generic minimizing behavior,
restricted for simplicity to the case of linear optimization over closed semi-algebraic
sets.

The outline of the manuscript is as follows. We begin in Section 2 by recording
some basic notation to be used throughout the manuscript. In Section 3 we recall
some rudimentary elements of semi-algebraic geometry and prove the semi-algebraic
Sard theorem for weak critical values. In Section 4 we establish various critical
point properties of generic semi-algebraic functions, while in Section 5 we refine
the analysis of the previous section for semi-algebraic functions in composite form.


2 Basic notation 

We begin by summarizing a few basic notions of variational and set-valued analysis.
Unless otherwise stated, we follow the terminology and notation of [40, 41].
Throughout $\mathbf{R}^n$ will denote an $n$-dimensional Euclidean space with inner-product
$\langle \cdot, \cdot \rangle$ and corresponding norm $|\cdot|$. We denote by $B_\epsilon(x)$ an open ball of radius $\epsilon$
around a point $x$ in $\mathbf{R}^n$.

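A set-valued mapping $F$ from $\mathbf{R}^n$ to $\mathbf{R}^m$, denoted $F\colon \mathbf{R}^n \rightrightarrows \mathbf{R}^m$, is a mapping
taking points in $\mathbf{R}^n$ to subsets of $\mathbf{R}^m$, with the domain and graph of $F$ being

$$\operatorname{dom} F := \{x \in \mathbf{R}^n : F(x) \neq \emptyset\},$$

$$\operatorname{gph} F := \{(x,y) \in \mathbf{R}^n \times \mathbf{R}^m : y \in F(x)\}.$$

We say that $F$ is finite-valued, when the cardinality of the image $F(x)$ is finite for
every $x \in \mathbf{R}^n$.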
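To make the notation concrete, the following minimal sketch represents a set-valued mapping with a finite graph as a Python set of pairs. The helper names `domain`, `image`, and `inverse` and the sample graph are illustrative assumptions, not objects from the text; the graph samples the relation $y^2 = x$.

```python
def domain(graph):
    """dom F: all points x for which F(x) is nonempty."""
    return {x for x, _ in graph}

def image(graph, x):
    """F(x): all y such that the pair (x, y) lies in gph F."""
    return {y for a, y in graph if a == x}

def inverse(graph):
    """gph F^{-1}: swap the coordinates of every pair in gph F."""
    return {(y, x) for x, y in graph}

# Hypothetical finite sample of the relation y**2 == x, viewed as x -> F(x).
gph_F = {(0, 0), (1, 1), (1, -1), (4, 2), (4, -2)}

print(domain(gph_F))              # points with nonempty image
print(image(gph_F, 4))            # F(4) has two elements
print(image(inverse(gph_F), -2))  # F^{-1}(-2) is a singleton
```

In this sample $F$ is finite-valued but not single-valued, while its inverse is single-valued on its domain.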

A mapping $\hat F\colon \mathbf{R}^n \rightrightarrows \mathbf{R}^m$ is a localization of $F$ around $(x,y) \in \operatorname{gph} F$ if the
graphs of $F$ and $\hat F$ coincide on a neighborhood of $(x,y)$. The following is the
central notion we explore.

Definition 2.1 (Strong regularity and weak critical points). A set-valued mapping
$F\colon \mathbf{R}^n \rightrightarrows \mathbf{R}^m$ is $C^p$-strongly regular at $(x,y) \in \operatorname{gph} F$ if the inverse $F^{-1}$ admits a
$C^p$-smooth single-valued localization around $(y,x)$.
A vector $y \in \mathbf{R}^m$ is a $C^p$-weak critical value of $F$ if there exists a point $x$ in the
preimage $F^{-1}(y)$, so that $F$ is not $C^p$-strongly regular at $(x,y)$.

Observe that $y$ being a weak critical value of $F$, at the very least, entails that the
preimage $F^{-1}(y)$ is nonempty. It is instructive to comment on the terms “strong”
and “weak”. We use these to differentiate strong regularity from the weaker notion
of metric regularity [25, 40] and the corresponding criticality concept. Note that the
term “weakly critical” (with no qualifier) refers to the real-analytic version of the
definition.
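As a one-dimensional illustration of Definition 2.1, consider the single-valued mapping $F(x) = x^2$, viewed as a set-valued mapping on the real line. The sketch below is a numerical heuristic, not a proof: it samples values near $y$ and checks whether the preimage has exactly one point near $x$. The function names and the tolerances `radius`, `window`, and `samples` are assumptions made for illustration.

```python
import math

def preimage(y):
    """F^{-1}(y) for F(x) = x**2, as a list of real points."""
    if y < 0:
        return []
    if y == 0:
        return [0.0]
    r = math.sqrt(y)
    return [-r, r]

def looks_strongly_regular(x, y, radius=1e-3, window=0.1, samples=50):
    """Heuristic test: for sampled values y2 near y, does F^{-1}(y2)
    contain exactly one point within `window` of x?"""
    for k in range(-samples, samples + 1):
        y2 = y + radius * k / samples
        close = [p for p in preimage(y2) if abs(p - x) < window]
        if len(close) != 1:
            return False
    return True

print(looks_strongly_regular(1.0, 1.0))   # branch sqrt(y) near (1, 1)
print(looks_strongly_regular(-1.0, 1.0))  # branch -sqrt(y) near (-1, 1)
print(looks_strongly_regular(0.0, 0.0))   # two branches meet at the origin
```

Around $(y, x) = (1, 1)$ the inverse is locally the analytic function $\sqrt{y}$, while near $(0,0)$ the preimage is empty for $y < 0$ and two-valued for $y > 0$, so $y = 0$ is the only weak critical value of this mapping.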

A mapping $F\colon Q \to \widetilde{Q}$, where $Q \subset \mathbf{R}^n$ and $\widetilde{Q}$ is a subset of $\mathbf{R}^m$, is $C^p$-smooth if for each point
$x \in Q$, there is a neighborhood $U$ of $x$ and a $C^p$-smooth mapping $\hat F\colon \mathbf{R}^n \to \mathbf{R}^m$ that
agrees with $F$ on $Q \cap U$. The symbol $C^\omega$ will always mean real analytic. Smooth
manifolds will play an important role in our work; a nice reference is [28].

Definition 2.2 (Smooth manifolds). A subset $\mathcal{M} \subset \mathbf{R}^n$ is a $C^p$ manifold of
dimension $r$ if for each point $x \in \mathcal{M}$, there is an open neighborhood $U$ around $x$
and a mapping $F$ from $\mathbf{R}^n$ to an $(n - r)$-dimensional Euclidean space so that $F$ is
$C^p$-smooth with the derivative $\nabla F(x)$ having full rank and we have

$$\mathcal{M} \cap U = \{x \in U : F(x) = 0\}.$$

In this case, the tangent space to $\mathcal{M}$ at $x$ is simply the set $T_{\mathcal{M}}(x) := \ker \nabla F(x)$,
while the normal space to $\mathcal{M}$ at $x$ is defined by $N_{\mathcal{M}}(x) := \operatorname{range} \nabla F(x)^*$.

Given a $C^1$-smooth manifold $\mathcal{M}$ and a mapping $F$ that is $C^1$-smooth on $\mathcal{M}$,
we will say that $F$ has constant rank on $\mathcal{M}$ if the rank of the operator $\nabla \hat F(x)$
restricted to $T_{\mathcal{M}}(x)$, with $\hat F$ being any $C^1$-smooth mapping agreeing with $F$ on a
neighborhood of $x$ in $\mathcal{M}$, is the same for all $x \in \mathcal{M}$.

3 Semi-algebraic geometry and Sard’s theorem

Our current work is cast in the setting of semi-algebraic geometry. A semi-algebraic
set $Q \subset \mathbf{R}^n$ is a finite union of sets of the form

$$\{x \in \mathbf{R}^n : P_1(x) = 0, \dots, P_k(x) = 0,\ R_1(x) < 0, \dots, R_l(x) < 0\},$$

where $P_1, \dots, P_k$ and $R_1, \dots, R_l$ are polynomials in $n$ variables. In other words, $Q$
is a union of finitely many sets, each defined by finitely many polynomial equalities
and inequalities. A map $F\colon \mathbf{R}^n \rightrightarrows \mathbf{R}^m$ is semi-algebraic if $\operatorname{gph} F \subset \mathbf{R}^{n+m}$ is a
semi-algebraic set. For more details on semi-algebraic geometry, see for example [9, 17].
An important feature of semi-algebraic sets is that they can be decomposed into
analytic manifolds. Imposing a very weak condition on the way the manifolds fit
together, we arrive at the following notion.

Definition 3.1 (Stratification). A $C^p$-stratification of a semi-algebraic set $Q$ is a
finite partition of $Q$ into disjoint semi-algebraic $C^p$ manifolds $\{\mathcal{M}_i\}$ (called strata)
with the property that for each index $i$, the intersection of the closure of $\mathcal{M}_i$ with
$Q$ is the union of some $\mathcal{M}_j$'s.

In particular, we can now define the dimension of any semi-algebraic set $Q$.

Definition 3.2 (Dimension of semi-algebraic sets). The dimension of a semi-algebraic
set $Q \subset \mathbf{R}^n$ is the maximal dimension of a
semi-algebraic $C^1$ manifold appearing in any $C^1$-stratification of $Q$.

It turns out that the dimension of a semi-algebraic set $Q$ does not depend on any
particular stratification. It is often useful to refine stratifications. Consequently, the
following notion becomes convenient.

Definition 3.3 (Compatibility). Given finite collections $\{B_i\}$ and $\{C_j\}$ of subsets of
$\mathbf{R}^n$, we say that $\{B_i\}$ is compatible with $\{C_j\}$ if for all $B_i$ and $C_j$, either $B_i \cap C_j = \emptyset$
or $B_i \subset C_j$.
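For finite sets, the compatibility condition of Definition 3.3 can be checked directly. The following is a minimal sketch, assuming the collections are given as lists of Python sets; the function name `is_compatible` and the sample collections are hypothetical.

```python
def is_compatible(Bs, Cs):
    """Return True when every B in Bs is either disjoint from,
    or contained in, every C in Cs."""
    return all(B.isdisjoint(C) or B <= C for B in Bs for C in Cs)

# Hypothetical collections of subsets of {1, ..., 5}.
Cs = [{1, 2, 3, 4}, {3, 4, 5}]
fine = [{1, 2}, {3, 4}, {5}]   # each piece sits inside or outside each C
coarse = [{2, 3}]              # {2, 3} meets {3, 4, 5} without lying in it

print(is_compatible(fine, Cs))
print(is_compatible(coarse, Cs))
```

The collection `fine` plays the role of a refinement: each of its pieces never straddles the boundary of a set in `Cs`.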

As we have alluded to at the onset, the following is a deep existence theorem for 
semi-algebraic stratifications 021 Theorem 4.8]. 

Theorem 3.4 (Stratifications exist). Consider a semi-algebraic set $Q$ in $\mathbf{R}^n$ and
a semi-algebraic map $F\colon Q \to \mathbf{R}^m$. Let $\mathcal{A}$ be a finite collection of semi-algebraic
subsets of $Q$ and $\mathcal{B}$ a finite collection of semi-algebraic subsets of $\mathbf{R}^m$. Then there
exists a $C^\omega$-stratification $\mathcal{A}'$ of $Q$ that is compatible with $\mathcal{A}$ and a $C^\omega$-stratification
$\mathcal{B}'$ of $\mathbf{R}^m$ compatible with $\mathcal{B}$ such that for every stratum $\mathcal{M} \in \mathcal{A}'$, the restriction of
$F$ to $\mathcal{M}$ is analytic and has constant rank, and the image $F(\mathcal{M})$ is a stratum in $\mathcal{B}'$.

Classically a set $U \subset \mathbf{R}^n$ is said to be “generic”, if it is large in some precise
mathematical sense, depending on context. Two popular choices are that of $U$ being
full-measure, meaning its complement has Lebesgue measure zero, and that of $U$
being topologically generic, meaning it contains a countable intersection of dense
open sets. In general, these notions are very different. However for semi-algebraic
sets, the situation simplifies drastically. Indeed, if $U \subset \mathbf{R}^n$ is a semi-algebraic set,
then the following are equivalent.

• $U$ is dense.

• $U$ is full-measure.

• $U$ is topologically generic.

• The dimension of the complement $U^c$ is strictly smaller than $n$.

Complements of such sets are said to be negligible.

The following is the basic tool that we will use. A semi-algebraic finite-valued
mapping $F\colon \mathbf{R}^n \rightrightarrows \mathbf{R}^m$ can be decomposed into finitely many $C^\omega$-smooth single-valued
selections that “cross” almost nowhere. This result is standard: it readily
follows for example from m Corollary 2.27]. We provide a proof sketch for
completeness.

Theorem 3.5 (Selections of finite-valued semi-algebraic mappings). Consider a
finite-valued semi-algebraic mapping $G\colon \mathbf{R}^n \rightrightarrows \mathbf{R}^m$. Then there exists an integer
$N$, a finite collection of open semi-algebraic sets $\{U_i\}_{i=0}^N$ in $\mathbf{R}^n$, and analytic
semi-algebraic single-valued mappings

$$G_i^j\colon U_i \to \mathbf{R}^m \quad \text{for } i = 0, \dots, N \text{ and } j = 1, \dots, i$$

satisfying:

1. $\bigcup_i U_i$ is dense in $\mathbf{R}^n$.

2. For any $x \in U_i$, the image $G(x)$ has cardinality $i$.

3. We have the representation

$$G(x) = \{G_i^j(x) : j = 1, 2, \dots, i\} \quad \text{whenever } x \in U_i.$$

Proof. Since $G$ is semi-algebraic, there exists an integer $N$ with the property that
the cardinality of the images $G(x)$ is no greater than $N$ [37, Theorem 4.4]. For
$i = 0, \dots, N$, define $U_i$ to be the set of points $x \in \mathbf{R}^n$ so that the image $G(x)$ has
cardinality precisely equal to $i$. A standard argument shows that the sets $U_i$ are
semi-algebraic. Stratifying, we replace each $U_i$ with an open set (possibly empty)
so that the union of the sets $U_i$ is dense in $\mathbf{R}^n$.

Fix now an index $i$. By m Corollary 2.27], there exists a dense open subset
$X_i$ of $U_i$ with the property that there exists a semi-algebraic set $Y_i \subset \mathbf{R}^m$ and a
semi-algebraic homeomorphism $\theta_i\colon \operatorname{gph} G|_{X_i} \to X_i \times Y_i$ satisfying

$$\theta_i(\{x\} \times G(x)) = \{x\} \times Y_i \quad \text{for all } x \in X_i.$$

Observe that for each $i$ the set $Y_i$ has cardinality $i$. Enumerate the elements of $Y_i$
by labeling $Y_i = \{y_1, \dots, y_i\}$. Define $\pi$ to be the projection $\pi(x, y) = y$ and for each
$j = 1, \dots, i$ set

$$G_i^j(x) = \pi \circ \theta_i^{-1}(x, y_j) \quad \text{for } x \in X_i.$$

Stratifying $X_i$, we may replace $U_i$ by an open dense subset on which all the mappings
$G_i^j$ are analytic. The result follows. □

In particular, this theorem is applicable for semi-algebraic mappings with “small”
graphs, since such mappings are finite-valued almost everywhere m Proposition 4.3].

Theorem 3.6 (Finite selections for mappings with small graphs). Suppose that
the graph of a semi-algebraic set-valued mapping $F\colon \mathbf{R}^n \rightrightarrows \mathbf{R}^m$ has dimension no
larger than $m$. Then the inverse mapping $F^{-1}\colon \mathbf{R}^m \rightrightarrows \mathbf{R}^n$ is finite-valued almost
everywhere.

We now arrive at the semi-algebraic Sard theorem, the main result of this
section.

Theorem 3.7 (Semi-algebraic Sard theorem for weakly critical values). Consider
a semi-algebraic set-valued mapping $F\colon \mathbf{R}^n \rightrightarrows \mathbf{R}^m$ satisfying $\dim \operatorname{gph} F \le m$. Then
the collection of weakly critical values of $F$ is a negligible semi-algebraic set. More
precisely, there exists an integer $N$, a finite collection of open semi-algebraic sets
$\{U_i\}_{i=0}^N$ in $\mathbf{R}^m$, and analytic semi-algebraic single-valued mappings

$$G_i^j\colon U_i \to \mathbf{R}^n \quad \text{for } i = 0, \dots, N \text{ and } j = 1, \dots, i$$

satisfying:

1. $\bigcup_i U_i$ is dense in $\mathbf{R}^m$.

2. For any $x \in U_i$, the preimage $F^{-1}(x)$ has cardinality $i$.

3. We have the representation

$$F^{-1}(x) = \{G_i^j(x) : j = 1, 2, \dots, i\} \quad \text{whenever } x \in U_i.$$

Proof. Consider the open semi-algebraic sets $\{U_i\}_{i=0}^N$ along with the single-valued,
analytic, semi-algebraic mappings $G_i^j\colon U_i \to \mathbf{R}^n$ provided by Theorems 3.5 and 3.6.
Since for any $y \in U_i$, the preimage $F^{-1}(y)$ has cardinality $i$ and we have $F^{-1}(y) =
\{G_i^j(y) : j = 1, 2, \dots, i\}$, we deduce that the values $G_i^j(y)$ for $j = 1, \dots, i$ are all
distinct. Since the mappings $G_i^j$ are in particular continuous, we deduce that the mapping $F^{-1}$
has a single-valued analytic localization around $(y,x)$ for every point $x \in F^{-1}(y)$.
The result follows. □

We note that a Sard type theorem for semi-algebraic set-valued mappings with
possibly large graphs, where criticality means absence of “metric regularity” EEUH],
was proved in [26]. Since we will not use this concept in the current work, we omit
the details.
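A simple instance of Theorem 3.7 is the polynomial $F(x) = x^3 - 3x$ on the real line, whose graph has dimension one. The sketch below counts the points of the preimage $F^{-1}(y)$ using the discriminant of the cubic $x^3 - 3x - y$; the function name `preimage_cardinality` is an assumption made for illustration.

```python
def preimage_cardinality(y):
    """Number of distinct real solutions x of x**3 - 3*x = y.

    For the depressed cubic x**3 + p*x + q with p = -3 and q = -y,
    the discriminant is -4*p**3 - 27*q**2 = 108 - 27*y**2.
    """
    disc = 108 - 27 * y * y
    if disc > 0:
        return 3   # three distinct real roots
    if disc < 0:
        return 1   # one real root
    return 2       # a double root and a simple root

for y in (-3, -2, 0, 2, 3):
    print(y, preimage_cardinality(y))
```

In the notation of the theorem one may take $U_3 = (-2, 2)$ and $U_1 = \{y : |y| > 2\}$, with the remaining sets empty. The union is dense, and the only weakly critical values are $y = \pm 2$, where two branches of the inverse merge at a double root.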


4 Critical points of generic semi-algebraic functions

In this section, we derive properties of critical points (appropriately defined) of
semi-algebraic functions under generic linear perturbations. Throughout, we will consider
functions $f$ on $\mathbf{R}^n$ taking values in the extended-real-line $\overline{\mathbf{R}} = \mathbf{R} \cup \{+\infty\}$. We will
always assume that such functions are proper, meaning they are not identically
equal to $+\infty$. The domain and epigraph of $f$ are

$$\operatorname{dom} f := \{x \in \mathbf{R}^n : f(x) < +\infty\},$$
$$\operatorname{epi} f := \{(x,r) \in \mathbf{R}^n \times \mathbf{R} : r \ge f(x)\}.$$

The indicator function of a set $Q \subset \mathbf{R}^n$, denoted $\delta_Q$, is defined to be zero on $Q$
and $+\infty$ off it. A function $f\colon \mathbf{R}^n \to \overline{\mathbf{R}}$ is lower-semicontinuous (lsc) whenever
the epigraph $\operatorname{epi} f$ is closed. The notion of criticality we consider arises from the
workhorse of variational analysis, the subdifferential.

Definition 4.1 (Subdifferentials and critical points). Consider a function $f\colon \mathbf{R}^n \to \overline{\mathbf{R}}$
and a point $\bar x$ with $f(\bar x)$ finite.

1. The proximal subdifferential of $f$ at $\bar x$, denoted $\partial_p f(\bar x)$, consists of all vectors
$v \in \mathbf{R}^n$ satisfying

$$f(x) \ge f(\bar x) + \langle v, x - \bar x \rangle + O(|x - \bar x|^2).$$

2. The limiting subdifferential of $f$ at $\bar x$, denoted $\partial f(\bar x)$, consists of all vectors $v \in
\mathbf{R}^n$ for which there exist sequences $x_i \in \mathbf{R}^n$ and $v_i \in \partial_p f(x_i)$ with $(x_i, f(x_i), v_i)$
converging to $(\bar x, f(\bar x), v)$.

3. The horizon subdifferential of $f$ at $\bar x$, denoted $\partial^\infty f(\bar x)$, consists of all vectors
$v \in \mathbf{R}^n$ for which there exist points $x_i \in \mathbf{R}^n$, vectors $v_i \in \partial f(x_i)$, and real
numbers $t_i \searrow 0$ with $(x_i, f(x_i), t_i v_i)$ converging to $(\bar x, f(\bar x), v)$.

We say that $\bar x$ is a critical point of $f$ whenever the inclusion $0 \in \partial f(\bar x)$ holds.

The subdifferentials $\partial_p f$ and $\partial f$ generalize the notion of a gradient to the nonsmooth
setting. In particular, if $f$ is $C^2$-smooth, then $\partial_p f$ and $\partial f$ simply coincide
with the gradient $\nabla f$, while if $f$ is convex, both subdifferentials coincide with the
subdifferential of convex analysis [41, Proposition 8.12]. The horizon subdifferential
$\partial^\infty f$ plays an entirely different role: it detects horizontal normals to the epigraph
of $f$ and is instrumental in establishing calculus rules [41, Theorem 10.6]. For any
set $Q \subset \mathbf{R}^n$, we define the proximal and limiting normal cones by the formulas
$N_Q^p := \partial_p \delta_Q$ and $N_Q := \partial \delta_Q$, respectively.

We will show in this section that any semi-algebraic function, subject to a generic
linear perturbation, satisfies a number of desirable properties around any of its
critical points. To this end, a key result for us will be that whenever $f\colon \mathbf{R}^n \to \overline{\mathbf{R}}$
is semi-algebraic, the graphs of the two subdifferentials $\partial_p f$ and $\partial f$ have dimension
exactly $n$ m Theorem 3.7]. (This remains true even in a local sense within the
subdifferential graphs [in Theorem 3.8], [12, Theorem 5.13].) Combining this with
Theorem 3.7 we immediately deduce that generic subgradients of a semi-algebraic
function are not weakly critical.

This observation, in turn, has immediate implications for minimizers of generic
semi-algebraic functions, since strong regularity of the subdifferential is closely related
to quadratic growth of the function. To be more precise, recall that $\bar x$ is a
strong local minimizer of a function $f$ whenever there exist $\alpha > 0$ and a neighborhood
$U$ of $\bar x$ so that

$$f(x) \ge f(\bar x) + \frac{\alpha}{2}|x - \bar x|^2 \quad \text{for each } x \text{ in } U.$$

A more stable version of this condition follows.

Definition 4.2 (Stable strong local minimizers). A point $\bar x$ is a stable strong local
minimizer of a function $f\colon \mathbf{R}^n \to \overline{\mathbf{R}}$ if there exist $\alpha > 0$ and a neighborhood $U$ of
$\bar x$ so that for every vector $v$ near the origin, there is a point $x_v$ (necessarily unique)
in $U$, with $x_0 = \bar x$, so that in terms of the perturbed functions $f_v := f(\cdot) - \langle v, \cdot \rangle$,
the inequality

$$f_v(x) \ge f_v(x_v) + \frac{\alpha}{2}|x - x_v|^2 \quad \text{holds for each } x \text{ in } U.$$

In [EL Proposition 3.1, Corollary 3.2], the authors show that strong metric
regularity of the subdifferential at $(\bar x, v)$, where $\bar x$ is a local minimizer of
$f_v := f(\cdot) - \langle v, \cdot \rangle$, always implies that $\bar x$ is a stable strong local minimizer of $f_v$. See also
nna for related results. Thus local minimizers of any semi-algebraic function, for a
generic linear perturbation parameter, are stable strong local minimizers. Moreover,
since the subdifferentials all have dimension exactly $n$ and $\dim\, (\operatorname{gph} \partial f) \setminus (\operatorname{gph} \partial_p f) <
n$, it is easy to see that for a generic vector $v$, the strict complementarity condition

$$v \in \partial f(x) \implies v \in \operatorname{ri} \partial_p f(x) \quad \text{holds for any } x \in \mathbf{R}^n.$$

We summarize all of these observations below.

^This notion appears under the name of uniform quadratic growth for tilt perturbations in where 
it is considered in the context of optimization problems in composite form. 
Corollary 4.3 (Basic generic properties of semi-algebraic problems). Consider an
lsc, semi-algebraic function $f\colon \mathbf{R}^n \to \overline{\mathbf{R}}$. Then there exists an integer $N > 0$ such
that for a generic vector $v \in \mathbf{R}^n$ the function

$$f_v(x) := f(x) - \langle v, x \rangle$$

has no more than $N$ critical points. In turn, each such critical point $x$ satisfies the
strict complementarity condition

$$0 \in \operatorname{ri} \partial_p f_v(x),$$

and if moreover $x$ is a local minimizer of $f_v$, then $x$ is a stable strong local minimizer.
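The simplest nonsmooth illustration of Corollary 4.3 is $f(x) = |x|$ on the real line, where the proximal and limiting subdifferentials both coincide with the convex subdifferential. The sketch below encodes $\partial f(x)$ as a closed interval and tests criticality and strict complementarity for the tilted function $f_v(x) = |x| - vx$; all function names are hypothetical.

```python
def subdifferential_abs(x):
    """Subdifferential of f(x) = |x|, returned as a closed interval (lo, hi)."""
    if x > 0:
        return (1.0, 1.0)
    if x < 0:
        return (-1.0, -1.0)
    return (-1.0, 1.0)

def is_critical(x, v):
    """x is critical for f_v = |.| - v*(.) exactly when v lies in the subdifferential."""
    lo, hi = subdifferential_abs(x)
    return lo <= v <= hi

def strictly_complementary(x, v):
    """Test whether v lies in the relative interior of the subdifferential.
    The relative interior of a single point is the point itself."""
    lo, hi = subdifferential_abs(x)
    if lo == hi:
        return v == lo
    return lo < v < hi

print(is_critical(0.0, 0.3), strictly_complementary(0.0, 0.3))
print(is_critical(0.0, 1.0), strictly_complementary(0.0, 1.0))
print(is_critical(2.0, 1.0))
```

For $|v| < 1$ the origin is the unique critical point and strict complementarity holds; for $|v| > 1$ there are no critical points. Only the two exceptional values $v = \pm 1$ produce infinitely many critical points and a failure of strict complementarity at the origin, in line with the genericity statement.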


We will see that by appealing further to semi-algebraic stratifications much more
is true: any semi-algebraic function, up to a generic perturbation, admits a unique
“stable active set”. To introduce this notion, we briefly record some notation.
When working with possibly discontinuous functions $f\colon \mathbf{R}^n \to \overline{\mathbf{R}}$, it is useful to
consider $f$-attentive convergence of a sequence $x_i$ to a point $\bar x$, denoted $x_i \xrightarrow{f} \bar x$. In
this notation

$$x_i \xrightarrow{f} \bar x \quad \Longleftrightarrow \quad x_i \to \bar x \ \text{ and } \ f(x_i) \to f(\bar x).$$

An $f$-attentive localization of $\partial f$ at $(\bar x, \bar v)$ is any mapping $T\colon \mathbf{R}^n \rightrightarrows \mathbf{R}^n$ that coincides
on an $f$-attentive neighborhood of $\bar x$ with some localization of $\partial f$ at $(\bar x, \bar v)$.

It is often useful to require a kind of uniformity of subgradients. Recall that
the subdifferential $\partial f$ of an lsc convex function $f$ is monotone in the sense that
$\langle v_1 - v_2, x_1 - x_2 \rangle \ge 0$ for any pairs $(x_1, v_1)$ and $(x_2, v_2)$ in $\operatorname{gph} \partial f$. Relaxing this
property slightly leads to the following concept [36, Definition 1.1].


Definition 4.4 (Prox-regularity). An lsc function $f\colon \mathbf{R}^n \to \overline{\mathbf{R}}$ is called prox-regular
at $\bar x$ for $\bar v$, with $\bar v \in \partial_p f(\bar x)$, if there exists a constant $r > 0$ and an $f$-attentive
localization $T$ of $\partial f$ around $(\bar x, \bar v)$ so that $T + rI$ is monotone.

In particular $C^2$-smooth functions and lsc, convex functions are prox-regular at
each of their points [41, Example 13.30, Proposition 13.34]. In contrast, the negative
norm function $x \mapsto -|x|$ is not prox-regular at the origin.

We are now ready to state what we mean by a “stable active set”. This notion,
introduced in m, and rooted in even earlier manuscripts [Tlf6l-[8l lT8l [211 [221118],
extends active sets in nonlinear programming far beyond the classical setting. The
exact details of the definition will not be important for us, since we will immediately
pass to an equivalent companion concept that is more convenient for our purposes.
Roughly speaking, a smooth manifold $\mathcal{M}$ is said to be “active” or “partly smooth”
for a function $f$ whenever $f$ varies smoothly along the manifold and sharply off it.
The parallel subspace of any nonempty set $Q$, denoted $\operatorname{par} Q$, is the affine hull of
$\operatorname{conv} Q$ translated to contain the origin. We also adopt the convention $\operatorname{par} \emptyset = \{0\}$.

Definition 4.5 (Partial smoothness). Consider an lsc function $f\colon \mathbf{R}^n \to \overline{\mathbf{R}}$ and a
$C^p$ manifold $\mathcal{M}$. Then $f$ is $C^p$-partly smooth ($p \ge 2$) with respect to $\mathcal{M}$ at $\bar x \in \mathcal{M}$
for $\bar v \in \partial f(\bar x)$ if

1. (smoothness) $f$ restricted to $\mathcal{M}$ is $C^p$-smooth on a neighborhood of $\bar x$.

2. (prox-regularity) $f$ is prox-regular at $\bar x$ for $\bar v$.

3. (sharpness) $\operatorname{par} \partial_p f(\bar x) = N_{\mathcal{M}}(\bar x)$.

4. (continuity) There exists a neighborhood $V$ of $\bar v$, such that the mapping
$x \mapsto V \cap \partial f(x)$, when restricted to $\mathcal{M}$, is inner-semicontinuous at $\bar x$.

In [16, Proposition 8.4], it was shown that the somewhat involved definition of
partial smoothness can be captured more succinctly, assuming a strict complementarity
condition $v \in \operatorname{ri} \partial_p f(x)$. Indeed, the essence of partial smoothness is in the
fact that algorithms generating iterates, along with approximate criticality certificates,
often “identify” a distinguished manifold in finitely many iterations; see the
extensive discussions in [16, 24].

Definition 4.6 (Identifiable manifolds). Consider an lsc function $f\colon \mathbf{R}^n \to \overline{\mathbf{R}}$.
Then a set $\mathcal{M} \subset \mathbf{R}^n$ is a $C^p$ identifiable manifold of $f$ at a point $\bar x \in \mathcal{M}$
for $\bar v \in \partial f(\bar x)$ if the set $\mathcal{M}$ is a $C^p$ manifold around $\bar x$, the restriction of $f$ to
$\mathcal{M}$ is $C^p$-smooth around $\bar x$, and $\mathcal{M}$ has the finite identification property: for any
sequences $x_i \xrightarrow{f} \bar x$ and $v_i \to \bar v$, with $v_i \in \partial f(x_i)$, the points $x_i$ must lie in $\mathcal{M}$ for all
sufficiently large indices $i$.

In [16, Proposition 8.4], the authors showed that for $p \ge 2$ the two sophisticated
looking properties

1. $f$ is $C^p$-partly smooth with respect to $\mathcal{M}$ at $x$ for $v$,

2. $v \in \operatorname{ri} \partial_p f(x)$,

taken together are simply equivalent to $\mathcal{M}$ being a $C^p$ identifiable manifold of $f$ at
$x$ for $v \in \partial_p f(x)$. This will be the key observation that we will use with regard to
partly smooth manifolds.

It is important to note that identifiable manifolds can fail to exist. For example,
the function $f(x,y) = (|x| + |y|)^2$ does not admit any identifiable manifold at the
origin for the zero subgradient. On the other hand, we will see that such behavior,
in a precise mathematical sense, is rare.

Roughly speaking, existence of an identifiable manifold at a critical point opens
the door to Newton-type acceleration strategies [M1I291I33] and moreover certifies
that sensitivity analysis of the nonsmooth problem is in essence classical [23, 32].
To illustrate, we record two basic properties of identifiable manifolds m Propositions
5.9, 7.2], which we will use in Section 5.

Theorem 4.7 (Basic properties of identifiable manifolds). Consider an lsc function
$f\colon \mathbf{R}^n \to \overline{\mathbf{R}}$ and suppose that $\mathcal{M}$ is a $C^2$-identifiable manifold around $x$ for $v = 0 \in
\partial_p f(x)$. Then the following are equivalent.

1. $x$ is a strong local minimizer of $f$.

2. $x$ is a strong local minimizer of $f + \delta_{\mathcal{M}}$.

Moreover, equality

$$\operatorname{gph} \partial f = \operatorname{gph} \partial (f + \delta_{\mathcal{M}})$$
holds on an $f$-attentive neighborhood of $(x,v)$.
Generic existence of identifiable manifolds for semi-algebraic functions will now
be a simple consequence of stratifiability of semi-algebraic sets. We note that, in
particular, it shows that convexity is superfluous for the main results of [3].

Corollary 4.8 (Generic properties of semi-algebraic problems). Consider an lsc,
semi-algebraic function $f\colon \mathbf{R}^n \to \overline{\mathbf{R}}$. Then there exists an integer $N > 0$ such that
for a generic vector $v \in \mathbf{R}^n$ the function

$$f_v(x) := f(x) - \langle v, x \rangle$$

has no more than $N$ critical points. Moreover each such critical point $\bar x$ satisfies

1. (prox-regularity) $f_v$ is prox-regular at $\bar x$ for $0$.

2. (strict complementarity) The inclusion $0 \in \operatorname{ri} \partial_p f_v(\bar x)$ holds.

3. (identifiable manifold) $f_v$ admits a $C^\omega$ identifiable manifold $\mathcal{M}$ at $\bar x$ for $0$.

4. (smooth dependence of critical points) The subdifferential $\partial f$ is strongly
regular at $(\bar x, v)$. More precisely, there exist neighborhoods $U$ of $\bar x$ and $V$ of $v$
so that the critical point mapping

$$w \mapsto U \cap (\partial f)^{-1}(w) = \{x \in U : x \text{ is critical for } f(\cdot) - \langle w, \cdot \rangle\}$$

is single-valued and analytic on $V$, and maps $V$ onto $\mathcal{M}$.

Moreover if $\bar x$ is a local minimizer of $f_v$ then $\bar x$ is in fact a stable strong local
minimizer of $f_v$.

Proof. Generic finiteness of critical points and generic strict complementarity were
already recorded in Corollary 4.3. We now tackle existence of identifiable manifolds.
To this end, by m Theorem 3.7], the graph of the subdifferential mapping
$\partial f\colon \mathbf{R}^n \rightrightarrows \mathbf{R}^n$ has dimension $n$. Consequently, applying Theorem 3.7 we obtain a
collection of open semi-algebraic sets $\{U_i\}_{i=0}^N$ of $\mathbf{R}^n$, with dense union, and analytic
semi-algebraic single-valued mappings

$$G_i^j\colon U_i \to \mathbf{R}^n \quad \text{for } i = 0, \dots, N \text{ and } j = 1, \dots, i$$

with the property that for each $v \in U_i$ the set $(\partial f)^{-1}(v)$ has cardinality $i$ and we
have the representation

$$(\partial f)^{-1}(v) = \{G_i^j(v) : j = 1, 2, \dots, i\}.$$

Let $\mathcal{B}$ now be a stratification of $\operatorname{dom} f$ so that $f$ is analytic on each stratum.
Applying Theorem 3.4 to each $G_i^j$, we obtain a stratification $\mathcal{A}_i^j$ of $U_i$ so that $G_i^j$ is
analytic and has constant rank on each stratum $A$ of $\mathcal{A}_i^j$, and so that $f$ is
analytic on the images $G_i^j(A)$. Finding a stratification of $U_i$ compatible with $\bigcup_j \mathcal{A}_i^j$,
we obtain a dense open subset $\hat U_i$ of $U_i$ so that around each point $v \in \hat U_i$ there
exists a neighborhood $V$ of $v$ so that $G_i^j$ is analytic and has constant rank on $V$,
and so that $f$ is analytic on the images $G_i^j(V)$. Due to the constant rank condition,
decreasing $V$ further, we may be assured that the sets $G_i^j(V)$ are all analytic manifolds.
Taking into account Theorem 3.7 we may also assume that none of the values in $\hat U_i$
are weakly critical. Consequently for each $v \in \hat U_i$, there exists a sufficiently small
neighborhood $V$ of $v$ so that the analytic manifold $G_i^j(V)$ coincides with $(\partial f)^{-1}(V)$
on a neighborhood of $G_i^j(v)$. Hence $G_i^j(V)$ is an identifiable manifold at $G_i^j(v)$ for
$v$. Finally, appealing to Corollary 4.3 the result follows. □


Next we look more closely at second order growth, from the perspective of second 
derivatives. To this end, we record the following standard definition. 

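Definition 4.9 (Subderivatives). Consider a function $f\colon \mathbb{R}^n \to \overline{\mathbb{R}}$ and a point $x$
with $f(x)$ finite. Then the subderivative of $f$ at $x$ is defined by

    $$df(x)(\bar u) := \liminf_{t \downarrow 0,\ u \to \bar u} \frac{f(x + tu) - f(x)}{t},$$

while for any vector $v \in \mathbb{R}^n$, the critical cone of $f$ at $x$ for $v$ is defined by

    $$C_f(x, v) := \{u \in \mathbb{R}^n : \langle v, u \rangle = df(x)(u)\}.$$

The parabolic subderivative of $f$ at $x$ for $\bar u \in \operatorname{dom} df(x)$ with respect to $\bar w$ is

    $$d^2 f(x)(\bar u \mid \bar w) = \liminf_{t \downarrow 0,\ w \to \bar w} \frac{f(x + t\bar u + \tfrac12 t^2 w) - f(x) - t\, df(x)(\bar u)}{\tfrac12 t^2}.$$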

Some comments are in order. The subderivative $df(x)(\bar u)$ simply
measures the rate of change of $f$ in direction $\bar u$. Whenever $f$ is locally Lipschitz
continuous at $x$ we may set $u = \bar u$ in the definition. The critical cone $C_f(x, v)$ denotes
the set of directions $u$ along which the directional derivative at $x$ of the function
$x \mapsto f(x) - \langle v, x \rangle$ vanishes. The parabolic subderivative $d^2 f(x)(u \mid w)$ measures the
second order variation of $f$ along points lying on a parabolic arc, and hence the
name. In particular, when $f$ is $C^2$-smooth at $x$, we have

    $$d^2 f(x)(u \mid w) = \langle \nabla^2 f(x) u, u \rangle + \langle \nabla f(x), w \rangle.$$
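The smooth formula can be checked numerically. The sketch below compares the difference quotient along a parabolic arc with the closed-form expression, for a smooth test function and test vectors of our own choosing (none of them appear in the source). Since the test function is smooth, the quotient is evaluated at $w$ itself rather than along a sequence $w_i \to w$.

```python
import math


def f(x1, x2):
    # Illustrative smooth function (our choice, not from the source).
    return x1 * x1 + x1 * x2 + math.exp(x2)


def grad_f(x1, x2):
    return (2.0 * x1 + x2, x1 + math.exp(x2))


def hess_f(x1, x2):
    return ((2.0, 1.0), (1.0, math.exp(x2)))


def parabolic_quotient(x, u, w, t):
    """Difference quotient from the definition of the parabolic subderivative."""
    g = grad_f(*x)
    first_order = g[0] * u[0] + g[1] * u[1]  # equals df(x)(u) for smooth f
    p = (x[0] + t * u[0] + 0.5 * t * t * w[0],
         x[1] + t * u[1] + 0.5 * t * t * w[1])
    return (f(*p) - f(*x) - t * first_order) / (0.5 * t * t)


def closed_form(x, u, w):
    """Evaluate <∇²f(x)u, u> + <∇f(x), w>."""
    g, H = grad_f(*x), hess_f(*x)
    Hu = (H[0][0] * u[0] + H[0][1] * u[1], H[1][0] * u[0] + H[1][1] * u[1])
    return Hu[0] * u[0] + Hu[1] * u[1] + g[0] * w[0] + g[1] * w[1]


x, u, w = (1.0, 0.0), (1.0, -1.0), (0.5, 2.0)
exact = closed_form(x, u, w)
approx = parabolic_quotient(x, u, w, t=1e-3)
```

The two values agree up to an error of order $t$, as expected from a Taylor expansion along the arc.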


These three constructions figure prominently in second-order optimality conditions.
Namely, if $x$ is a local minimizer of $f$, then $df(x)(u) \ge 0$ for all $u \in \mathbb{R}^n$, and
we have $\inf_{w \in \mathbb{R}^n} d^2 f(x)(u \mid w) \ge 0$ for any nonzero $u \in C_f(x, 0)$. On the other hand,
deviating from the classical theory, the assumption $df(x)(u) \ge 0$ for all $u \in \mathbb{R}^n$
along with the positivity $\inf_{w \in \mathbb{R}^n} d^2 f(x)(u \mid w) > 0$ for any nonzero $u \in C_f(x, 0)$,
guarantees that $x$ is a strong local minimizer of $f$ only under additional regularity
assumptions on the function $f$. See for example [5] or [41, Theorem 13.66] for more
details.

We will now see that in the generic semi-algebraic set-up, the situation simplifies 
drastically: the parabolic subderivative completely characterizes quadratic growth 
at a critical point. The key to the development, not surprisingly, is the relationship 
between subderivatives of a function $f$ and the subderivatives of the restriction of
$f$ to an identifiable manifold.

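Theorem 4.10 (First-order subderivatives and identifiable manifolds). Consider
an lsc function $f\colon \mathbb{R}^n \to \overline{\mathbb{R}}$ and suppose that $f$ admits a $C^2$ identifiable manifold
$\mathcal{M}$ at a point $x$ for $v \in \partial_P f(x)$. Then for any $u \in T_{\mathcal{M}}(x)$ we have

    $$df(x)(u) = d(f + \delta_{\mathcal{M}})(x)(u) = \langle v, u \rangle.$$

Proof. Let $g\colon \mathbb{R}^n \to \mathbb{R}$ be a $C^2$-smooth function coinciding with $f$ on $\mathcal{M}$ near $x$.
Standard subdifferential calculus implies

    $$\partial_P f(x) \subset \partial_P (f + \delta_{\mathcal{M}})(x) = \partial_P (g + \delta_{\mathcal{M}})(x) = \nabla g(x) + N_{\mathcal{M}}(x).$$

Moreover, one can easily verify $d(g + \delta_{\mathcal{M}})(x)(u) = \langle \nabla g(x), u \rangle$ for any $u \in T_{\mathcal{M}}(x)$.
Since by the chain of inclusions above $v$ lies in $\nabla g(x) + N_{\mathcal{M}}(x)$, we deduce

    $$d(f + \delta_{\mathcal{M}})(x)(u) = d(g + \delta_{\mathcal{M}})(x)(u) = \langle \nabla g(x), u \rangle = \langle v, u \rangle.$$

Now since identifiable manifolds are partly smooth, we have $\operatorname{par} \partial_P f(x) = N_{\mathcal{M}}(x)$.
Consequently we deduce

    $$\operatorname{aff} \partial_P f(x) = \nabla g(x) + N_{\mathcal{M}}(x).$$

In particular, for any $u \in T_{\mathcal{M}}(x)$ we have equality $\langle \operatorname{aff} \partial_P f(x), u \rangle = \langle \nabla g(x), u \rangle =
\langle v, u \rangle$. On the other hand $df(x)$ is the support function of the Fréchet subdifferential
$\hat\partial f(x)$ (see [41, Exercise 8.4]), and since $f$ is prox-regular at $x$ for $v$, we have
$\operatorname{aff} \partial_P f(x) = \operatorname{aff} \hat\partial f(x)$. We conclude $df(x)(u) = \langle v, u \rangle$, as claimed. □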

As a direct consequence, we deduce that critical cones are simply tangent spaces 
to identifiable manifolds, when the latter exist. The following is an extension of m 
Proposition 6.4]. 

Theorem 4.11 (Critical cones and identifiable manifolds). Consider an lsc function
$f\colon \mathbb{R}^n \to \overline{\mathbb{R}}$ and suppose that $f$ admits a $C^2$ identifiable manifold $\mathcal{M}$ at a point $x$
for $v \in \partial_P f(x)$. Then the critical cone coincides with the tangent space:

    $$C_f(x, v) = T_{\mathcal{M}}(x).$$

Proof. The inclusion $C_f(x, v) \supset T_{\mathcal{M}}(x)$ is immediate from Theorem 4.10. Conversely,
consider a vector $u \in C_f(x, v)$. Since $df(x)$ is the support function of the
Fréchet subdifferential $\hat\partial f(x)$ (see [41, Exercise 8.4]) and by prox-regularity the
subdifferentials $\partial_P f(x)$ and $\hat\partial f(x)$ coincide near $v$, we deduce that $u$ lies in $N_{\partial_P f(x)}(v)$.
On the other hand, by Theorem 4.7, locally near $v$ we have equality

    $$\partial_P f(x) = \partial_P (f + \delta_{\mathcal{M}})(x) = \nabla g(x) + N_{\mathcal{M}}(x),$$

where $g$ is any $C^2$-smooth function agreeing with $f$ on $\mathcal{M}$ near $x$. Consequently $u$
lies in $T_{\mathcal{M}}(x)$, as claimed. □
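As a small illustration of the theorem, consider the function $f(x_1, x_2) = |x_1| + x_2^2$ at the origin; this example and all names below are our own and are not taken from the source. Here $df(0)(u) = |u_1|$ and $\partial_P f(0) = [-1, 1] \times \{0\}$, and for $v = (0.5, 0)$ the manifold $\mathcal{M} = \{x : x_1 = 0\}$ is identifiable. The sketch tests membership in the critical cone directly from the definition and compares it with the tangent space of $\mathcal{M}$.

```python
def subderivative_at_origin(u):
    """df(0)(u) for the illustrative f(x1, x2) = |x1| + x2**2 (our choice)."""
    return abs(u[0])


def in_critical_cone(u, v, tol=1e-12):
    """Membership test for C_f(0, v) = {u : <v, u> = df(0)(u)}."""
    inner = v[0] * u[0] + v[1] * u[1]
    return abs(inner - subderivative_at_origin(u)) <= tol


def in_tangent_space(u, tol=1e-12):
    """Tangent space at 0 of the candidate manifold M = {x : x1 = 0}."""
    return abs(u[0]) <= tol


# With v in the relative interior of the subdifferential, the two sets agree.
v_generic = (0.5, 0.0)
samples = [(0.0, 1.0), (0.0, -3.0), (1.0, 0.0), (-1.0, 2.0), (0.2, 0.2)]
agreement = all(in_critical_cone(u, v_generic) == in_tangent_space(u) for u in samples)
```

For the boundary choice $v = (1, 0)$, where strict complementarity fails and $\mathcal{M}$ is no longer identifiable, the direction $u = (1, 0)$ lies in the critical cone but not in the tangent space, so the hypothesis of the theorem cannot simply be dropped.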

Next we need set analogues of subderivatives - first-order and second-order tangent
sets. These are obtained by applying the subderivative concepts to the indicator
function. More concretely we have the following.

Definition 4.12 (First order and second order tangent sets). Consider a set $Q \subset \mathbb{R}^n$
and a point $x \in Q$. Then the tangent cone to $Q$ at $x$ is the set

    $$T_Q(x) := \{u : \exists\, t_i \downarrow 0 \text{ and } u_i \to u \text{ such that } x + t_i u_i \in Q\},$$

while the critical cone of $Q$ at $x$ for $v$ is defined by

    $$C_Q(x, v) := T_Q(x) \cap v^{\perp}.$$

The second-order tangent set to $Q$ at $x$ for $u \in T_Q(x)$ is the set

    $$T_Q^2(x \mid u) := \{w : \exists\, t_i \downarrow 0 \text{ and } w_i \to w \text{ such that } x + t_i u + \tfrac12 t_i^2 w_i \in Q\}.$$

One can now easily verify the relationships:

    $$T_Q(x) = \operatorname{dom} d\delta_Q(x), \qquad C_Q(x, v) = C_{\delta_Q}(x, v), \qquad T_Q^2(x \mid u) = \operatorname{dom} d^2 \delta_Q(x)(u \mid \cdot).$$

Next we record an important relationship between projections and identifiable
manifolds [32, Proposition 4.5]. Naturally, we say that a set $\mathcal{M}$ is a $C^p$ identifiable
manifold relative to a set $Q$ at $x$ for $v \in N_Q(x)$ whenever $\mathcal{M}$ is a $C^p$ identifiable
manifold relative to the indicator function $\delta_Q$ at $x$ for $v \in \partial \delta_Q(x)$.

Proposition 4.13 (Projections and identifiability). Consider a closed set $Q \subset \mathbb{R}^n$
and suppose that $\mathcal{M}$ is a $C^p$-identifiable manifold ($p \ge 2$) at $x$ for $v \in N_Q(x)$.
Then for all sufficiently small $\lambda > 0$, the projections $P_Q$ and $P_{\mathcal{M}}$ coincide on a
neighborhood of $x + \lambda v$ and are $C^{p-1}$-smooth there.

Proposition 4.14 (Second-order tangents to sets with identifiable structure). Suppose
that a closed set $Q \subset \mathbb{R}^n$ admits an identifiable $C^3$ manifold $\mathcal{M}$ at $x$ for $v \in N_Q(x)$.
Consider a nonzero tangent $u \in T_{\mathcal{M}}(x)$ and a vector $w \in T_Q^2(x \mid u)$. Then for any
real $\varepsilon > 0$, there exist $\hat u \in T_{\mathcal{M}}(x)$ and $\hat w \in T_{\mathcal{M}}^2(x \mid \hat u)$ satisfying

    $$|\hat u - u| < \varepsilon \quad \text{and} \quad \langle v, \hat w \rangle \ge \langle v, w \rangle.$$

Proof. By definition of $w$, there exist numbers $t_i \downarrow 0$ and vectors $w_i \to w$ so that
the points $x_i := x + t_i u + \tfrac12 t_i^2 w_i$ lie in $Q$ for each $i$. By Proposition 4.13, we
may choose $r > 0$ satisfying $P_Q(x + rv) = x$, so that $P_Q$ coincides with $P_{\mathcal{M}}$ on a
neighborhood of $x + rv$, and so that $P_Q$ is $C^2$-smooth on this neighborhood. Define
now $z_i := P_Q(x_i + rv)$. Since $P_Q$ is $C^2$-smooth on a neighborhood of $x + rv$, we may
write $z_i = x + t_i \hat u + \tfrac12 t_i^2 \hat w_i$ for some $\hat u \in T_{\mathcal{M}}(x)$ and some $\hat w_i$ converging to a vector
$\hat w \in T_{\mathcal{M}}^2(x \mid \hat u)$. It is standard that the derivative $\nabla P_{\mathcal{M}}(x)$ coincides with the linear
projection onto the tangent space $T_{\mathcal{M}}(x)$, and hence decreasing $r$ we may ensure
$|\hat u - u| < \varepsilon$. By definition of $z_i$, we then have the inequality

    $$|x_i - z_i + rv| \le r|v|,$$

and hence, after squaring both sides and expanding,

    $$\langle v, z_i - x_i \rangle \ge \frac{1}{2r} |x_i - z_i|^2 \ge 0.$$

Since $v$ is orthogonal to both $u$ and $\hat u$, we deduce

    $$0 \le \big\langle v,\ t_i(\hat u - u) + \tfrac12 t_i^2 (\hat w_i - w_i) \big\rangle = \tfrac12 t_i^2 \langle v, \hat w_i - w_i \rangle.$$

Dividing by $\tfrac12 t_i^2$ and taking the limit, the result follows. □

Finally, we arrive at the key relationship between the parabolic subderivative of
a function and that of its restriction to an identifiable manifold.

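Corollary 4.15 (Second-order subderivatives and identifiability). Suppose that an
lsc function $f\colon \mathbb{R}^n \to \overline{\mathbb{R}}$ admits an identifiable $C^3$ manifold $\mathcal{M}$ at $x$ for $0 \in \partial_P f(x)$.
Consider a nonzero vector $u \in T_{\mathcal{M}}(x)$ and a vector $w$. Then for any real $\varepsilon > 0$,
there exist $\hat u \in T_{\mathcal{M}}(x)$ and $\hat w$ satisfying $|\hat u - u| < \varepsilon$ and

    $$d^2 f(x)(u \mid w) \ge d^2 (f + \delta_{\mathcal{M}})(x)(\hat u \mid \hat w).$$

Proof. By [16, Proposition 3.14], the set $\mathcal{K} := \operatorname{gph}(f + \delta_{\mathcal{M}})$ is a $C^3$ identifiable
manifold relative to $\operatorname{epi} f$ at $(x, f(x))$ for $(0, -1)$. Moreover, by Theorem 4.10 we
have

    $$T_{\mathcal{K}}(x, f(x)) = \{(u, \alpha) : u \in T_{\mathcal{M}}(x) \text{ and } \alpha = df(x)(u)\}.$$

Define $\beta := df(x)(u)$. Then by [41, Example 13.62], the equality

    $$\operatorname{epi} d^2 f(x)(u \mid \cdot) = T_{\operatorname{epi} f}^2\big((x, f(x)) \mid (u, \beta)\big)$$

holds. Define $r := d^2 f(x)(u \mid w)$. Applying Proposition 4.14, we deduce that there
exist $(\hat u, \hat\beta) \in T_{\mathcal{K}}(x, f(x))$ and $(\hat w, \hat r) \in T_{\mathcal{K}}^2\big((x, f(x)) \mid (\hat u, \hat\beta)\big)$ satisfying

    $$|(\hat u, \hat\beta) - (u, \beta)| < \varepsilon \quad \text{and} \quad \langle (0, -1), (\hat w, \hat r) \rangle \ge \langle (0, -1), (w, r) \rangle.$$

Clearly $\hat\beta = df(x)(\hat u)$ and $\hat r = d^2 (f + \delta_{\mathcal{M}})(x)(\hat u \mid \hat w)$. We deduce

    $$d^2 f(x)(u \mid w) \ge d^2 (f + \delta_{\mathcal{M}})(x)(\hat u \mid \hat w),$$

as claimed. □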

We now arrive at the main result of this section. 

Theorem 4.16 (Generic properties of semi-algebraic problems). Consider an lsc,
semi-algebraic function $f\colon \mathbb{R}^n \to \overline{\mathbb{R}}$. Then there exists an integer $N > 0$ such that
for a generic vector $v \in \mathbb{R}^n$ the function

    $$f_v(x) := f(x) - \langle v, x \rangle$$

has no more than $N$ critical points. Moreover, each such critical point $x$ satisfies:

1. (prox-regularity) $f_v$ is prox-regular at $x$ for $0$.

2. (strict complementarity) The inclusion $0 \in \operatorname{ri} \partial_P f_v(x)$ holds.

3. (identifiable manifold) $f_v$ has an identifiable manifold $\mathcal{M}$ at $x$ for $0$.

4. (smooth dependence of critical points) The subdifferential $\partial f$ is strongly
regular at $(x, v)$. More precisely, there exist neighborhoods $U$ of $x$ and $V$ of $v$
so that the critical point mapping

    $$w \mapsto U \cap (\partial f)^{-1}(w) = \{x \in U : x \text{ is critical for } f(\cdot) - \langle w, \cdot \rangle\}$$

is single-valued and analytic on $V$, and maps $V$ onto $\mathcal{M}$.

Moreover, the following are all equivalent.

(i) $x$ is a local minimizer of $f_v$.

(ii) $x$ is a stable strong local minimizer of $f_v$.

(iii) The inequality

    $$\inf_{w \in \mathbb{R}^n} d^2 f_v(x)(u \mid w) > 0 \quad \text{holds for all } 0 \ne u \in C_f(x, v).$$

(iv) The inequality

    $$\inf_{w \in \mathbb{R}^n} d^2 (f_v + \delta_{\mathcal{M}})(x)(u \mid w) > 0 \quad \text{holds for all } 0 \ne u \in T_{\mathcal{M}}(x).$$

Proof. In light of Corollary 4.8 we must only argue the claimed equivalence of the
four properties. To this end, observe that for generic $v$, the equivalence (i) ⇔ (ii)
was established in Corollary 4.8. On the other hand, Theorem 4.7 shows that (ii)
is equivalent to $x$ being a strong local minimizer of $f_v$ on $\mathcal{M}$, which in turn for
classical reasons is equivalent to (iv). Note also that the implication (iii) ⇒ (iv) is
obvious from Theorem 4.11. Thus we must only show the implication (iv) ⇒ (iii),
but this follows immediately from Corollary 4.15. □

Note that property (iv) in the theorem above involves only classical analysis.


5 Composite semi-algebraic optimization 

In this section, we consider composite optimization problems of the form

    $$\min_x\ f(x) + h(G(x)),$$

where $f\colon \mathbb{R}^n \to \overline{\mathbb{R}}$ and $h\colon \mathbb{R}^m \to \overline{\mathbb{R}}$ are lsc functions and $G\colon \mathbb{R}^n \to \mathbb{R}^m$ is $C^2$-smooth.
A prime example is the case of smoothly constrained optimization; this is the case
where $h$ is the indicator function of a closed set. We call a point $x \in \mathbb{R}^n$ composite
critical for the problem if there exists a vector

    $$\lambda \in \partial h(G(x)) \quad \text{satisfying} \quad -\nabla G(x)^* \lambda \in \partial f(x).$$

Whenever the optimality condition above holds, we call $\lambda$ a Lagrange multiplier
vector and the tuple $(x, \lambda)$ a composite critical pair. The multiplier $\lambda$ is sure to be
unique under the condition:

    (1)    $$\operatorname{par} \partial h(G(x)) \cap [\nabla G(x)^*]^{-1} \operatorname{par} \partial f(x) = \{0\}.$$

Indeed, this is a direct analogue of the linear independence constraint qualification
in nonlinear programming.

In general, the notion of composite criticality is different from criticality (as
defined in the previous sections) for the function $f + h \circ G$. If $x$ is a critical point of
$f + h \circ G$, then $x$ is composite critical only under some additional condition, such
as the basic constraint qualification

    (2)    $$\partial^\infty h(G(x)) \cap [\nabla G(x)^*]^{-1} \partial^\infty f(x) = \{0\}.$$

This qualification condition is a generalization of the Mangasarian-Fromovitz
constraint qualification in nonlinear programming and is in particular implied by (4);
see the discussion in [39] for more details. Conversely, if $x$ is a composite critical
point and both $f$ and $h$ are subdifferentially regular [41, Definition 7.25] (as is the
case when $f$ and $h$ are convex), then $x$ is also a critical point of the function $f + h \circ G$.
In this section, we consider properties of composite critical points for generic
composite semi-algebraic problems. To this end, we will assume that $f$, $G$, and $h$
are all semi-algebraic and we will consider the canonically perturbed problems:

    $$\min_x\ f(x) + h(G(x) + y) - \langle v, x \rangle.$$

Then composite criticality is succinctly captured by the generalized equation

    (3)    $$\begin{bmatrix} v \\ y \end{bmatrix} \in \begin{bmatrix} \nabla G(x)^* \lambda \\ -G(x) \end{bmatrix} + \big(\partial f \times (\partial h)^{-1}\big)(x, \lambda).$$

The path to generic properties is now clear since the perturbation parameters $(v, y)$
appear in the range space of a semi-algebraic set-valued mapping having a small
graph.
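The generalized equation (3) can be solved by hand in simple cases. The sketch below uses a scalar instance of our own choosing (not from the source): $f(x) = \tfrac12 x^2$, $h = |\cdot|$ and $G(x) = x - 1$, so that $\partial f(x) = \{x\}$ and $(\partial h)^{-1}(\lambda)$ equals $\{0\}$ for $|\lambda| < 1$, $[0, \infty)$ for $\lambda = 1$ and $(-\infty, 0]$ for $\lambda = -1$. The helper names are hypothetical.

```python
def composite_critical_pair(v, y):
    """Solve (3) for the illustrative data f = x**2/2, h = |.|, G(x) = x - 1."""
    s = v + y - 1.0  # candidate multiplier when the kink G(x) + y = 0 is active
    if abs(s) <= 1.0:
        return 1.0 - y, s
    lam = 1.0 if s > 1.0 else -1.0
    return v - lam, lam


def satisfies_generalized_equation(v, y, x, lam, tol=1e-12):
    """Check both rows of (3) for the same illustrative data."""
    # First row: v - ∇G(x)^* λ must lie in ∂f(x) = {x}; here ∇G(x)^* λ = λ.
    row1 = abs((v - lam) - x) <= tol
    # Second row: y + G(x) must lie in (∂h)^{-1}(λ).
    z = y + (x - 1.0)
    if abs(lam) < 1.0:
        row2 = abs(z) <= tol
    elif lam == 1.0:
        row2 = z >= -tol
    elif lam == -1.0:
        row2 = z <= tol
    else:
        row2 = False
    return row1 and row2


x, lam = composite_critical_pair(0.5, 0.25)
ok = satisfies_generalized_equation(0.5, 0.25, x, lam)
```

In this instance the critical pair is unique for every $(v, y)$ and depends analytically on the parameters away from the two lines $v + y = 0$ and $v + y = 2$, which is where the multiplier sits on the boundary of $[-1, 1]$ with the kink active.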

Before we proceed, we briefly recall that subderivatives admit a convenient calculus
[41, Exercise 13.63] for the composite problem. In what follows, for any
$C^2$-smooth mapping $G(x) = (g_1(x), \ldots, g_m(x))$ we use the notation

    $$\nabla^2 G(x)[u, u] = \big(\langle \nabla^2 g_1(x) u, u \rangle, \ldots, \langle \nabla^2 g_m(x) u, u \rangle\big).$$


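Theorem 5.1 (Calculus of subderivatives).
Consider a $C^2$-smooth mapping $G\colon \mathbb{R}^n \to \mathbb{R}^m$ and lsc functions $f\colon \mathbb{R}^n \to \overline{\mathbb{R}}$ and
$h\colon \mathbb{R}^m \to \overline{\mathbb{R}}$. Suppose that a point $x$ satisfies the constraint qualification

    $$\partial^\infty h(G(x)) \cap [\nabla G(x)^*]^{-1} \partial^\infty f(x) = \{0\}.$$

Then the equality

    $$d(f + h \circ G)(x)(u) = df(x)(u) + dh(G(x))(\nabla G(x) u)$$

holds. Moreover, for any $u$ with $d(f + h \circ G)(x)(u)$ finite, we have

    $$d^2 (f + h \circ G)(x)(u \mid w) = d^2 f(x)(u \mid w) + d^2 h(G(x))\big(\nabla G(x) u \mid \nabla^2 G(x)[u, u] + \nabla G(x) w\big).$$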


We are now ready to prove the main result of this section. Note that if for almost 
every v, a property is valid for almost every y (with the v fixed), then by Fubini’s 
theorem the said property holds for almost every pair (v,y). The same holds with 
v and y reversed. We will use this observation implicitly throughout. 


Theorem 5.2 (Generic properties of composite optimization problems). Consider a
$C^2$-smooth semi-algebraic mapping $G\colon \mathbb{R}^n \to \mathbb{R}^m$ and lsc semi-algebraic functions
$f\colon \mathbb{R}^n \to \overline{\mathbb{R}}$ and $h\colon \mathbb{R}^m \to \overline{\mathbb{R}}$. Define now the family of composite optimization
problems $P(v, y)$ by

    $$\min_x\ f_v(x) + h(G_y(x)),$$

under the perturbations $f_v(x) := f(x) - \langle v, x \rangle$ and $G_y(x) := G(x) + y$. Then for
almost every $y \in \mathbb{R}^m$, the qualification conditions

    (4)    $$\operatorname{span} \partial^\infty h(G_y(x)) \cap [\nabla G(x)^*]^{-1} \operatorname{span} \partial^\infty f_v(x) = \{0\},$$

    (5)    $$\operatorname{par} \partial h(G_y(x)) \cap [\nabla G(x)^*]^{-1} \operatorname{par} \partial f_v(x) \subset \{0\},$$

hold for any $x$ for which $f_v(x)$ and $h(G_y(x))$ are finite. Moreover, there exists an
integer $N > 0$ such that for a generic collection of parameters $(v, y) \in \mathbb{R}^n \times \mathbb{R}^m$, the
problem $P(v, y)$ has at most $N$ composite critical points, and for any such composite
critical point $x$ of $P(v, y)$, there exists a unique Lagrange multiplier vector

    $$\lambda \in \partial h(G_y(x)) \quad \text{satisfying} \quad -\nabla G(x)^* \lambda \in \partial f_v(x).$$

Moreover, defining $w := -\nabla G(x)^* \lambda$, the following are true.

1. (prox-regularity) $f_v$ is prox-regular at $x$ for $w$ and $h$ is prox-regular at $G_y(x)$
for $\lambda$.

2. (strict complementarity) The inclusions

    $$\lambda \in \operatorname{ri} \partial_P h(G_y(x)) \quad \text{and} \quad w \in \operatorname{ri} \partial_P f_v(x)$$

hold.

3. (identifiable manifold) $f_v$ admits a $C^\omega$ identifiable manifold $\mathcal{M}$ at $x$ for $w$
and $h$ admits a $C^\omega$ identifiable manifold $\mathcal{K}$ at $G_y(x)$ for $\lambda$.

4. (nondegeneracy) The constraint qualification (nondegeneracy condition)
$N_{\mathcal{K}}(G_y(x)) \cap [\nabla G(x)^*]^{-1} N_{\mathcal{M}}(x) = \{0\}$ holds.


5. (smooth dependence of critical triples) The mapping

    $$(v, y) \mapsto \{(x, \lambda) : \text{the pair } (x, \lambda) \text{ is composite critical for } P(v, y)\}$$

admits a single-valued analytic localization around $(v, y, x, \lambda)$.

Moreover, the following are equivalent.

(i) $x$ is a local minimizer of $P(v, y)$.

(ii) $x$ is a strong local minimizer of $P(v, y)$.

(iii) The inequality

    $$d^2 f(x)(u \mid z) + d^2 h(G_y(x))\big(\nabla G(x) u \mid \nabla^2 G(x)[u, u] + \nabla G(x) z\big) > 0$$

holds for all nonzero $u \in C_f(x, w) \cap [\nabla G(x)]^{-1} C_h(G_y(x), \lambda)$ and all $z \in \mathbb{R}^n$.

(iv) The inequality

    $$d^2 (f + \delta_{\mathcal{M}})(x)(u \mid z) + d^2 (h + \delta_{\mathcal{K}})(G_y(x))\big(\nabla G(x) u \mid \nabla^2 G(x)[u, u] + \nabla G(x) z\big) > 0$$

holds for all nonzero $u \in T_{\mathcal{M}}(x) \cap [\nabla G(x)]^{-1} T_{\mathcal{K}}(G_y(x))$ and all $z \in \mathbb{R}^n$.

Proof. First applying j4 s Lemma 8] and Theorem 3.4 we obtain a $C^\omega$ stratification
$\{A_i\}$ of $\operatorname{dom} f$ and a $C^\omega$ stratification $\{B_j\}$ of $\operatorname{dom} h$ having the property that $f$ is
$C^\omega$-smooth on each $A_i$ and $h$ is $C^\omega$-smooth on each $B_j$, and so that

    $$\partial^\infty f(x) \cup \operatorname{par} \partial f(x) \subset N_{A_i}(x) \quad \text{and} \quad \partial^\infty h(z) \cup \operatorname{par} \partial h(z) \subset N_{B_j}(z)$$

for any $x \in A_i$ and $z \in B_j$. For fixed indices $i$ and $j$, the standard Sard's theorem
implies that for almost every $y \in \mathbb{R}^m$, the restriction of $G_y$ to $A_i$ is transverse to
$B_j$, that is, for any $x \in A_i$ with $G_y(x) \in B_j$ we have

    $$N_{B_j}(G_y(x)) \cap [\nabla G(x)^*]^{-1} N_{A_i}(x) = \{0\}.$$

Since there are finitely many indices $i$ and $j$, the claimed qualification conditions
(4) and (5) follow.

Define now the set-valued mapping $\mathcal{I}\colon \mathbb{R}^n \times \mathbb{R}^m \rightrightarrows \mathbb{R}^n \times \mathbb{R}^m$ by

    $$\mathcal{I}(x, \lambda) := \begin{bmatrix} \nabla G(x)^* \lambda \\ -G(x) \end{bmatrix} + \big(\partial f \times (\partial h)^{-1}\big)(x, \lambda).$$

Observe $(v, y) \in \mathcal{I}(x, \lambda)$ if and only if $(x, \lambda)$ is a composite critical pair for $P(v, y)$.
It is easy to see, in turn, that $\operatorname{gph} \mathcal{I}$ is $C^1$ diffeomorphic to $\operatorname{gph} \partial f \times \operatorname{gph} (\partial h)^{-1}$,
and hence by III Theorem 3.7] has dimension $n + m$. Applying the semi-algebraic
Sard's theorem for weakly critical values (Theorem 3.7), we deduce that there exists
an integer $N > 0$ such that for generic parameters $(v, y)$, the problem $P(v, y)$ has
at most $N$ composite critical points $x$. Moreover, for any composite critical point
$x$ of $P(v, y)$, the Lagrange multiplier vector $\lambda$ is unique for almost every $(v, y)$ by
inclusion (5).

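We now prove the strict complementarity claim. To this end, define the mapping

    $$\mathcal{I}_P(x, \lambda) := \begin{bmatrix} \nabla G(x)^* \lambda \\ -G(x) \end{bmatrix} + \big(\operatorname{ri} \partial_P f \times (\operatorname{ri} \partial_P h)^{-1}\big)(x, \lambda).$$

Clearly the inclusion $\operatorname{gph} \mathcal{I}_P \subset \operatorname{gph} \mathcal{I}$ holds, and by what we have already proved
both mappings $\mathcal{I}_P^{-1}$ and $\mathcal{I}^{-1}$ are finite valued almost everywhere. We now claim that
$\operatorname{gph} \mathcal{I}_P$ is dense in $\operatorname{gph} \mathcal{I}$. To see this, fix a pair $(v, y) \in \mathcal{I}(x, \lambda)$. Equivalently we
may write

    $$0 = w + \nabla G(x)^* \lambda, \quad \text{for some } w \in \partial f_v(x) \text{ and } \lambda \in \partial h(G_y(x)).$$

By definition of the limiting subdifferential, there are sequences $(x_k, u_k) \to (x, w + v)$
in $\operatorname{gph}(\operatorname{ri} \partial_P f)$ and $(z_k, \lambda_k) \to (G_y(x), \lambda)$ in $\operatorname{gph}(\operatorname{ri} \partial_P h)$. Defining
$\gamma_k := (u_k - (w + v)) + (\nabla G(x_k)^* \lambda_k - \nabla G(x)^* \lambda)$ and $\alpha_k := z_k - G_y(x_k)$, it is easy to
verify the inclusion

    $$(v + \gamma_k,\ y + \alpha_k) \in \mathcal{I}_P(x_k, \lambda_k).$$

Hence $\operatorname{gph} \mathcal{I}_P$ is dense in $\operatorname{gph} \mathcal{I}$. Since both $\mathcal{I}^{-1}$ and $\mathcal{I}_P^{-1}$ are semi-algebraic and
finite almost everywhere, it follows immediately that $\mathcal{I}^{-1}$ and $\mathcal{I}_P^{-1}$ agree almost
everywhere on $\mathbb{R}^n \times \mathbb{R}^m$. This establishes the strict complementarity claim 2.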
Moving on to existence of identifiable manifolds, applying Theorem 3.7 to the
mapping $\mathcal{I}_P$, we deduce that there exists an integer $N$, a finite collection of open
semi-algebraic sets $\{U_i\}_{i=0}^{N}$ in $\mathbb{R}^n \times \mathbb{R}^m$, and analytic semi-algebraic single-valued
mappings

    $$E_j^i\colon U_i \to \mathbb{R}^n \times \mathbb{R}^m \quad \text{for } i = 0, \ldots, N \text{ and } j = 1, \ldots, i$$

satisfying:

1. $\bigcup_i U_i$ is dense in $\mathbb{R}^n \times \mathbb{R}^m$.

2. For any $(v, y) \in U_i$, the image $\mathcal{I}^{-1}(v, y)$ has cardinality $i$.

3. We have the representation

    $$\mathcal{I}_P^{-1}(v, y) = \{E_j^i(v, y) : j = 1, 2, \ldots, i\} \quad \text{whenever } (v, y) \in U_i.$$

Let $X_j^i(v, y)$ denote the composition of $E_j^i$ with the projection $(x, \lambda) \mapsto x$ and
let $F_j^i(v, y) := G(X_j^i(v, y)) + y$. Applying Theorem 3.4 to each $X_j^i$ and $F_j^i$, we may
find a dense open subset $\widehat{U}_i$ of $U_i$ so that

• $f$ is analytic on $X_j^i(\widehat{U}_i)$ and $h$ is analytic on $F_j^i(\widehat{U}_i)$.

• $X_j^i$ and $F_j^i$ are analytic and have constant rank on $\widehat{U}_i$.

Let $(x, \lambda)$ be such that $X_j^i(v, y) = x$ and so that $(x, \lambda)$ is a composite critical
pair for $P(v, y)$. Define also $w := -\nabla G(x)^* \lambda$. Then due to the constant rank, there
exists a neighborhood $W$ of $(v, y)$ so that $X_j^i(W)$ and $F_j^i(W)$ are analytic manifolds.
We claim that $X_j^i(W)$ is an identifiable manifold relative to $f_v$ at $x$ for $w$ and that
$F_j^i(W)$ is an identifiable manifold relative to $h$ at $G_y(x)$ for $\lambda$.

To see this, consider sequences $(x_k, w_k) \to (x, w)$ in $\operatorname{gph} \partial f_v$ and $(z_k, \lambda_k) \to
(G_y(x), \lambda)$ in $\operatorname{gph} \partial h$. Defining $\gamma_k := (w_k - w) + (\nabla G(x_k)^* \lambda_k - \nabla G(x)^* \lambda)$ and
$\alpha_k := z_k - G_y(x_k)$ we have the inclusion

    $$(v + \gamma_k,\ y + \alpha_k) \in \mathcal{I}_P(x_k, \lambda_k).$$

Hence for all large indices $k$ the equality

    $$E_j^i(v + \gamma_k,\ y + \alpha_k) = (x_k, \lambda_k)$$

holds. We deduce for sufficiently large $k$ the inclusion $x_k \in X_j^i(W)$. Hence $X_j^i(W)$
is indeed identifiable relative to $f_v$ at $x$ for $w$. Moreover, we have
$z_k = F_j^i(v + \gamma_k, y + \alpha_k) \in F_j^i(W)$ for all large $k$. We conclude that $F_j^i(W)$ is identifiable relative
to $h$ at $G_y(x)$ for $\lambda$, as claimed. The nondegeneracy claim is a simple consequence
of the construction and the classical Sard's theorem. Finally, the four equivalent
properties are immediate from Theorems 4.16 and 5.1.

□

Note that Theorem 5.2 with $h = 0$ and $G = I$ reduces to Theorem 4.16. It is
interesting to reinterpret Theorem 5.2 in the convex setting. To this end, recall that
for any convex function $f\colon \mathbb{R}^n \to \overline{\mathbb{R}}$, the Fenchel conjugate $f^*\colon \mathbb{R}^n \to \overline{\mathbb{R}}$ is defined
by

    $$f^*(u) := \sup_x\ \{\langle u, x \rangle - f(x)\},$$

and the relationship $\partial f^* = (\partial f)^{-1}$ holds.

Fix now a linear mapping $A\colon \mathbb{R}^n \to \mathbb{R}^m$ and lsc convex functions $f\colon \mathbb{R}^n \to \overline{\mathbb{R}}$
and $h\colon \mathbb{R}^m \to \overline{\mathbb{R}}$. Within the Fenchel framework, we consider the family of primal
optimization problems given by

    $$\inf_x\ f(x) + h(Ax + y) - \langle v, x \rangle,$$

and associate with them the dual problems

    $$\sup_u\ -h^*(u) - f^*(v - A^* u) + \langle y, u \rangle.$$

Then the primal problem is feasible whenever $y$ lies in the set

    $$Y := \operatorname{dom} h - A(\operatorname{dom} f),$$

and the dual is feasible whenever $v$ lies in

    $$V := \operatorname{dom} f^* + A^*(\operatorname{dom} h^*).$$

Standard Fenchel duality then asserts that for $y$ in the interior of $Y$, the primal and
dual optimal values are equal and the dual is attained when finite. Assuming in
addition that $v$ lies in the interior of $V$, optimality is characterized by the generalized
equation

    $$\begin{bmatrix} v \\ y \end{bmatrix} \in \begin{bmatrix} A^* u \\ -Ax \end{bmatrix} + \big(\partial f \times \partial h^*\big)(x, u).$$

This is precisely an instance of the variational inequality (3) in a convex setting.
Assuming now that $f$ and $h$ are semi-algebraic, and applying Theorem 5.2, we
deduce that for generic parameters $(v, y)$, if the primal and dual problems are feasible
then the interiority conditions hold, and both the primal and the dual admit at most
one minimizer. Moreover, for any such minimizers $x$ and $u$, strict complementarity
holds for the primal and the dual, identifiable manifolds exist for both problems,
both objectives grow quadratically around $x$ and $u$, respectively, and the minimizers
$x$ and $u$ jointly vary analytically with the parameters $(v, y)$.
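The primal-dual picture can be checked on a scalar instance. The sketch below assumes data of our own choosing (not from the source): $f(x) = \tfrac12 x^2$, $h = |\cdot|$ and $A = 1$, so that $f^*(s) = \tfrac12 s^2$, $h^*$ is the indicator of $[-1, 1]$, and both $Y$ and $V$ are the whole real line. The dual is then a concave quadratic maximized over $[-1, 1]$, and the primal solution is recovered from the first row of the optimality system. All function names are hypothetical.

```python
def dual_solution(v, y):
    """Maximizer of -h*(u) - f*(v - u) + y*u over u in [-1, 1].

    The unconstrained maximizer of the concave quadratic is u = v + y,
    which is then clipped to the interval [-1, 1] = dom h*.
    """
    return min(1.0, max(-1.0, v + y))


def primal_solution(v, y):
    """Recover x from the first row of the optimality system: v = A*u + x."""
    return v - dual_solution(v, y)


def primal_value(v, y):
    x = primal_solution(v, y)
    return 0.5 * x * x + abs(x + y) - v * x


def dual_value(v, y):
    u = dual_solution(v, y)
    return -0.5 * (v - u) ** 2 + y * u


# The duality gap vanishes at the computed pair for each sample parameter.
gaps = [abs(primal_value(v, y) - dual_value(v, y))
        for (v, y) in [(0.5, 0.25), (3.0, 0.0), (-2.0, 0.5)]]
```

In this instance the pair $(x, u)$ is unique for every $(v, y)$ and varies analytically with the parameters except across the two lines $v + y = \pm 1$, where the dual solution reaches the boundary of $[-1, 1]$.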